Assortative mixing in networks 
A network is said to show assortative mixing if the nodes in the network that have many connections
tend to be connected to other nodes with many connections. We define a measure of assortative
mixing for networks and use it to show that social networks are often assortatively mixed, but that
technological and biological networks tend to be disassortative. We propose a model of an assortative
network, which we study both analytically and numerically. Within the framework of this model we
find that assortative networks tend to percolate more easily than their disassortative counterparts
and that they are also more robust to vertex removal.

Many systems take the form of networks (sets of vertices joined together by edges), including
social networks, computer networks, and biological networks. A variety of models of networks have
been proposed and studied in the physics literature, many of which have been successful at
reproducing features of networks in the real world. One particularly well-studied model is the
cumulative advantage or preferential attachment model, in which the probability of a given source
vertex forming a connection to a target vertex is some (usually increasing) function of the degree
of the target vertex. (The degree of a vertex is the number of other vertices to which it is
attached.) Preferential attachment processes are widely accepted as the probable explanation for
the power-law and other skewed degree distributions seen in many networks.



However, there is an important element missing from 
these as well as other network models: in none of these 
models does the probability of attachment to the target 
vertex depend also on the degree of the source vertex. In 
the real world on the other hand such dependencies are 
common. Many networks show "assortative mixing" on 
their degrees, i.e., a preference for high-degree vertices 
to attach to other high-degree vertices. Others show dis- 
assortative mixing — high-degree vertices attach to low- 
degree ones. In this paper we first demonstrate the pres- 
ence of assortative mixing in a variety of networks by 
direct measurement, and then argue, using exactly solv- 
able models and numerical simulations, that assortative 
mixing can have a substantial effect on the behavior of 
networked systems. Models that do not take it into ac- 
count will necessarily fail to reproduce correctly many of 
the behaviors of real-world networked systems. 

Consider then a network, represented in the simplest case by an undirected graph of $N$ vertices
and $M$ edges, with degree distribution $p_k$. That is, $p_k$ is the probability that a randomly
chosen vertex on the graph will have degree $k$. Now consider a vertex reached by following a
randomly chosen edge on the graph. The degree of this vertex is not distributed according to $p_k$.
Instead it is biased in favor of vertices of high degree, since more edges end at a high-degree
vertex than at a low-degree one. This means that the degree distribution for the vertex at the end
of a randomly chosen edge is proportional to $k p_k$, rather than just $p_k$. In this paper, we will
usually be interested not in the total degree of such a vertex, but in the remaining degree: the
number of edges leaving the vertex other than the one we arrived along. This number is one less
than the total degree and hence is distributed in proportion to $(k+1) p_{k+1}$. The correctly
normalized distribution $q_k$ of the remaining degree is then

$$ q_k = \frac{(k+1)\,p_{k+1}}{\sum_j j\,p_j}. \tag{1} $$
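As a small illustration of Eq. (1), the sketch below converts a degree distribution into the corresponding remaining-degree distribution. The function name and the list-based representation (`p[k]` holding $p_k$) are assumptions made for this example only.

```python
def remaining_degree_distribution(p):
    """Return q with q[k] = (k + 1) * p[k + 1] / sum_j j * p[j], as in Eq. (1).

    p is a list in which p[k] is the probability that a vertex has degree k.
    """
    z = sum(j * pj for j, pj in enumerate(p))  # mean degree
    if z == 0:
        raise ValueError("mean degree is zero, so there are no edges to follow")
    return [(k + 1) * p[k + 1] / z for k in range(len(p) - 1)]


# Example: half of the vertices have degree 1 and half have degree 3.
p = [0.0, 0.5, 0.0, 0.5]
q = remaining_degree_distribution(p)
print(q)  # [0.25, 0.0, 0.75]
```

Although only half of the vertices have degree 3, three quarters of edge ends land on them, which is the bias toward high-degree vertices described above.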

Following Callaway et al. [15], we now define the quantity $e_{jk}$ to be the joint probability
distribution of the remaining degrees of the two vertices at either end of a randomly chosen edge.
This quantity is symmetric in its indices on an undirected graph, $e_{jk} = e_{kj}$, and obeys the
sum rules

$$ \sum_{jk} e_{jk} = 1, \qquad \sum_j e_{jk} = q_k. \tag{2} $$

In a network with no assortative (or disassortative) mixing $e_{jk}$ takes the value $q_j q_k$. If
there is assortative mixing, $e_{jk}$ will differ from this value and the amount of assortative
mixing can be quantified by the connected degree-degree correlation function
$\langle jk\rangle - \langle j\rangle\langle k\rangle = \sum_{jk} jk\,(e_{jk} - q_j q_k)$, where
$\langle\ldots\rangle$ indicates an average over edges. This correlation function is zero for no
assortative mixing and positive or negative for assortative or disassortative mixing respectively.
For the purposes of comparing different networks, it is convenient to normalize it by dividing by
its maximal value, which it achieves on a perfectly assortative network, i.e., one with
$e_{jk} = q_k \delta_{jk}$. This value is equal to the variance
$\sigma_q^2 = \sum_k k^2 q_k - [\sum_k k q_k]^2$ of the distribution $q_k$, and hence the normalized
correlation function is

$$ r = \frac{1}{\sigma_q^2} \sum_{jk} jk\,(e_{jk} - q_j q_k), \tag{3} $$

which is simply the Pearson correlation coefficient of the degrees at either end of an edge and
lies in the range $-1 \le r \le 1$ [35]. For the practical purpose of evaluating $r$ on an observed
network, we can rewrite Eq. (3) as

$$ r = \frac{M^{-1}\sum_i j_i k_i - \bigl[M^{-1}\sum_i \tfrac12 (j_i + k_i)\bigr]^2}
            {M^{-1}\sum_i \tfrac12 (j_i^2 + k_i^2) - \bigl[M^{-1}\sum_i \tfrac12 (j_i + k_i)\bigr]^2}, \tag{4} $$
where $j_i$, $k_i$ are the degrees of the vertices at the ends of the $i$th edge, with
$i = 1 \ldots M$.

network                          n            r
physics coauthorship (a)         52 909       0.363
biology coauthorship (a)         1 520 251    0.127
mathematics coauthorship (b)     253 339      0.120
film actor collaborations (c)    449 913      0.208
company directors (d)            7 673        0.276
Internet (e)                     10 697       -0.189
World-Wide Web (f)               269 504      -0.065
protein interactions (g)         2 115        -0.156
neural network (h)               ?            ?
food web (i)                     92           -0.276
random graph (u)                              0
Callaway et al. (v)                           δ/(1 + 2δ)
Barabási and Albert (w)                       0

TABLE I: Size n and assortativity coefficient r for a number of different networks: collaboration
networks of (a) scientists in physics and biology, (b) mathematicians, (c) film actors, and
(d) businesspeople; (e) connections between autonomous systems on the Internet; (f) undirected
hyperlinks between Web pages in a single domain; (g) protein-protein interaction network in yeast;
(h) undirected (and unweighted) synaptic connections in the neural network of the nematode
C. elegans; (i) undirected trophic relations in the food web of Little Rock Lake, Wisconsin. The
last three lines give analytic results for model networks in the limit of large network size:
(u) the random graph of Erdős and Rényi; (v) the grown graph model of Callaway et al. [15];
(w) the preferential attachment model of Barabási and Albert.
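Equation (4) can be evaluated directly from an edge list. The following is a minimal sketch, assuming the network is given as a list of vertex pairs; the function name `assortativity` is chosen for this example and is not taken from the paper. It uses total degrees, which give the same $r$ as remaining degrees because subtracting one from every degree does not change a Pearson correlation.

```python
def assortativity(edges):
    """Evaluate r of Eq. (4) for an undirected graph given as (v, w) pairs."""
    degree = {}
    for v, w in edges:
        degree[v] = degree.get(v, 0) + 1
        degree[w] = degree.get(w, 0) + 1

    M = len(edges)
    sum_jk = sum_half = sum_half_sq = 0.0
    for v, w in edges:
        j, k = degree[v], degree[w]
        sum_jk += j * k                       # sum_i j_i k_i
        sum_half += 0.5 * (j + k)             # sum_i (j_i + k_i) / 2
        sum_half_sq += 0.5 * (j * j + k * k)  # sum_i (j_i^2 + k_i^2) / 2

    mean_sq = (sum_half / M) ** 2
    denominator = sum_half_sq / M - mean_sq
    if denominator == 0:
        raise ValueError("all edge ends have the same degree, so r is undefined")
    return (sum_jk / M - mean_sq) / denominator


# A star graph is perfectly disassortative: the hub only touches leaves.
star = [(0, 1), (0, 2), (0, 3), (0, 4)]
print(assortativity(star))  # -1.0
```

The guard on the denominator covers regular graphs, where every edge end has the same degree and the variance in Eq. (4) vanishes.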

In Table I we show values of $r$ for a variety of real-world networks. As the table shows, of the
social networks studied (the top five entries in the table) all have significant assortative
mixing, which accords with accepted wisdom within the sociological community. By contrast, the
technological and biological networks studied (the middle five entries) all have disassortative
mixing: high-degree vertices preferentially connect with low-degree ones and vice versa. Various
explanations for this observation suggest themselves. In the case of the Internet, for example, it
appears that the high-degree vertices mostly represent connectivity providers (telephone companies
and other communications carriers) who typically have a large number of connections to clients who
themselves have only a single connection. Thus the high-degree vertices do indeed tend to be
connected to the low-degree ones.

We have also calculated $r$ analytically for three models of networks: (1) the random graph of
Erdős and Rényi, in which edges are placed at random between a fixed set of vertices; (2) the grown
graph model of Callaway et al. [15], in which both edges and vertices are added at random at
constant but possibly different rates, the ratio of the rates being denoted $\delta$; (3) the grown
graph model of Barabási and Albert, in which both edges and vertices are added, and one end of each
edge is added with linear preferential attachment.



For the random graph, since edges are placed at random without regard to vertex degree, it follows
trivially that $r = 0$ in the limit of large graph size. The model of Callaway et al., however,
although apparently similar in construction, gives a markedly different result. From Eq. (21) of
Ref. [15], $e_{jk}$ for this model satisfies the recurrence relation

$$ (1 + 4\delta)\, e_{jk} = 2\delta\,(e_{j-1,k} + e_{j,k-1}) + p_j p_k, \tag{5} $$

and the degree distribution is $p_k = (2\delta)^k / (1 + 2\delta)^{k+1}$. Substituting into Eq. (3)
and making use of Eq. (2), we then find that $r = \delta/(1 + 2\delta)$. Thus the model shows
significant assortative mixing, with a maximum value of $r = \tfrac12$ in the limit of large
$\delta$. This agrees with intuition: in the grown graph the older vertices have higher degree and
also tend to have higher probability of being connected to one another, simply by virtue of being
around for longer. Thus one would expect positive assortative mixing.
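The closed form $r = \delta/(1 + 2\delta)$ can be checked numerically. The sketch below fills in $e_{jk}$ from the recurrence of Eq. (5) on a truncated grid and then evaluates the normalized correlation of Eq. (3). Two things are assumptions of this example rather than statements from the text: the boundary condition that $e_{jk} = 0$ whenever an index is negative, and the truncation size `K`.

```python
def callaway_r(delta, K=200):
    """Solve the recurrence of Eq. (5) for j, k < K and return r from Eq. (3)."""
    x = 2 * delta / (1 + 2 * delta)
    # p_k = (2 delta)^k / (1 + 2 delta)^(k + 1), written to avoid large powers
    p = [x ** k / (1 + 2 * delta) for k in range(K)]

    e = [[0.0] * K for _ in range(K)]
    for j in range(K):
        for k in range(K):
            left = e[j - 1][k] if j > 0 else 0.0  # assumed boundary: e_{-1,k} = 0
            down = e[j][k - 1] if k > 0 else 0.0  # assumed boundary: e_{j,-1} = 0
            e[j][k] = (2 * delta * (left + down) + p[j] * p[k]) / (1 + 4 * delta)

    q = [sum(e[j][k] for j in range(K)) for k in range(K)]  # sum rule, Eq. (2)
    mu = sum(k * q[k] for k in range(K))
    variance = sum(k * k * q[k] for k in range(K)) - mu * mu
    covariance = sum(j * k * e[j][k] for j in range(K) for k in range(K)) - mu * mu
    return covariance / variance


for delta in (0.5, 1.0):
    print(delta, callaway_r(delta), delta / (1 + 2 * delta))
```

Each entry depends only on entries with smaller indices, so a single pass over the grid is enough. The truncation only affects the sums, which is why `K` must be large enough for the tails of $q_k$ to be negligible at the chosen $\delta$.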

The model of Barabási and Albert provides an interesting counter-example to this intuition.
Although this is a grown graph model, in which again older vertices have higher degree [23], it
shows no assortative mixing at all. Making use of Eq. (42) of Ref. [?] we can show that $e_{jk}$
for the model of Barabási and Albert goes asymptotically as $1/(j^2 k^2) - 6/(j + k)^4$ in the
limit of large $j$ and $k$, which implies that $r \to 0$ as $(\log^2 N)/N$ as $N$ becomes large.
The model of Barabási and Albert has been used as a model of the structure of the Internet and the
World-Wide Web. Since these networks show significant disassortative mixing, however (Table I), it
is clear that the model is incomplete. It is an interesting open question what type of network
evolution processes could explain the values of $r$ observed in real-world networks.

Turning now to theoretical developments, we propose a simple model of an assortatively mixed
network, which is exactly solvable for many of its properties in the limit of large graph size.
Consider the ensemble of graphs in which the distribution $e_{jk}$ takes a specified value. This
defines a random graph model similar in concept to the random graphs with specified degree
sequence, except for the added element of assortative mixing.

Consider a typical member of this ensemble in the limit of large graph size, and consider a
randomly chosen edge in that graph, one end of which is attached to a vertex of degree $j$. We ask
what the probability distribution is of the number of other vertices reachable by following that
edge. Let this probability distribution be generated by a generating function $G_j(x)$, which
depends in general on the degree $j$ of the starting vertex. By arguments similar to those of
Ref. [?] we can show that $G_j(x)$ must satisfy a self-consistency condition of the form

$$ G_j(x) = x\,\frac{\sum_k e_{jk}\,[G_k(x)]^k}{\sum_k e_{jk}}, \tag{6} $$

while the number of vertices reachable from a randomly chosen vertex is generated by

$$ H(x) = x p_0 + x \sum_{k=1}^{\infty} p_k\,[G_{k-1}(x)]^k. \tag{7} $$

The average size of the component to which such a vertex belongs is given by the derivative of
$H$: $\langle s\rangle = H'(1) = 1 + \sum_k k p_k G'_{k-1}(1)$. Differentiating Eq. (6) we then get

$$ \langle s\rangle = 1 - z\,\mathbf{q}\cdot\mathbf{A}^{-1}\cdot\mathbf{q}, \tag{8} $$

where $z$ is the mean degree, $\mathbf{q}$ is the vector whose elements are the $q_k$, and
$\mathbf{A}$ is the asymmetric matrix with elements $A_{jk} = k e_{jk} - q_k \delta_{jk}$.
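The step from Eq. (6) to Eq. (8) can be spelled out as follows. This is a sketch under two assumptions: that there is no giant component, so that $G_k(1) = 1$, and that the index of $G$ is read as a remaining degree, as the appearance of $G_{k-1}$ for a vertex of degree $k$ in Eq. (7) suggests.

```latex
\begin{align*}
q_j\,G_j'(1) &= q_j + \sum_k k\,e_{jk}\,G_k'(1)
  && \text{(differentiate Eq.~(6) at $x = 1$, using $\textstyle\sum_k e_{jk} = q_j$)} \\
\sum_k \bigl(k\,e_{jk} - q_k\,\delta_{jk}\bigr)\,G_k'(1) &= -q_j
  && \text{(that is, $\mathbf{A}\cdot\mathbf{G}'(1) = -\mathbf{q}$)} \\
\langle s\rangle = 1 + \sum_k k\,p_k\,G_{k-1}'(1) &= 1 + z \sum_k q_k\,G_k'(1)
  && \text{(using $k\,p_k = z\,q_{k-1}$ from Eq.~(1))} \\
&= 1 - z\,\mathbf{q}\cdot\mathbf{A}^{-1}\cdot\mathbf{q}.
\end{align*}
```

The inverse of $\mathbf{A}$ enters through the second line, which is why the mean component size diverges where $\det\mathbf{A} = 0$.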

Equation (8) diverges at the point at which the determinant of $\mathbf{A}$ is zero. This point
marks the phase transition at which a giant component forms in our graph. By considering the
behavior of Eq. (8) close to the transition, where $\langle s\rangle$ must be large and positive in
the absence of a giant component, we deduce that a giant component exists in the network when
$\det\mathbf{A} > 0$. This is the appropriate generalization for a network with assortative mixing
of the criterion of Molloy and Reed [26] for the existence of a giant component.

To calculate the size $S$ of the giant component, we define $u_k$ to be the probability that an
edge connected to a vertex of remaining degree $k$ leads to another vertex that does not belong to
the giant component. Then

$$ S = 1 - p_0 - \sum_{k=1}^{\infty} p_k\,u_{k-1}^k, \qquad
   u_j = \frac{\sum_k e_{jk}\,u_k^k}{\sum_k e_{jk}}. \tag{9} $$

As with most other random graph models, including the original model of Erdős and Rényi, it is
usually not possible to solve for $S$ in closed form, but we can determine it by numerical
iteration from a suitable set of starting values for $u_k$.
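The numerical iteration mentioned above might look like the following sketch. It assumes the fixed-point form $u_j = \sum_k e_{jk} u_k^k / \sum_k e_{jk}$ and $S = 1 - p_0 - \sum_{k \ge 1} p_k u_{k-1}^k$, which is what Eqs. (6) and (7) give at $x = 1$. The function name, the starting value 0.5, the iteration count, and the truncation of the distributions are all choices made for this example.

```python
import math


def giant_component_size(p, e, iterations=200):
    """Iterate the condition for u_k and return the giant component size S.

    p[k] is the degree distribution for k = 0..K and e[j][k] is the joint
    remaining-degree distribution for j, k = 0..K-1.
    """
    K = len(e)
    row_sum = [sum(row) for row in e]  # equals q_j by the sum rule, Eq. (2)
    u = [0.5] * K                      # arbitrary starting values inside (0, 1)
    for _ in range(iterations):
        # Every new u_j is computed from the previous iterate of u.
        u = [
            sum(e[j][k] * u[k] ** k for k in range(K)) / row_sum[j]
            if row_sum[j] > 0 else 1.0
            for j in range(K)
        ]
    return 1.0 - p[0] - sum(p[k] * u[k - 1] ** k for k in range(1, K + 1))


# Check on an uncorrelated network: Poisson degrees with mean z, e_jk = q_j q_k.
z, K = 2.0, 40
p = [math.exp(-z) * z ** k / math.factorial(k) for k in range(K + 1)]
mean = sum(k * pk for k, pk in enumerate(p))
q = [(k + 1) * p[k + 1] / mean for k in range(K)]
e = [[qj * qk for qk in q] for qj in q]
S = giant_component_size(p, e)
print(S)  # close to the root of S = 1 - exp(-z S)
```

With no degree correlations the result should reduce to the familiar random-graph condition $S = 1 - e^{-zS}$, which makes this case a convenient sanity check before using a correlated $e_{jk}$.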

To test these results and to help form a more complete picture of the properties of assortatively
mixed networks, we have also performed computer simulations, generating networks with given values
of $e_{jk}$ and measuring their properties directly. Generating such networks is not entirely
trivial. One cannot simply draw a set of degree pairs $(j_i, k_i)$ for edges $i$ from the
distribution $e_{jk}$, since such a set would almost certainly fail to satisfy the basic
topological requirement that the number of edges ending at vertices of degree $k$ must be a
multiple of $k$. Instead, therefore, we propose the following Monte Carlo algorithm for generating
graphs.

First, we generate a random graph with the desired degree distribution according to the
prescription given in Ref. [?]. Then we apply a Metropolis dynamics to the graph in which on each
step we choose at random two edges, denoted by the vertex pairs $(v_1, w_1)$ and $(v_2, w_2)$ that
they connect. We measure the remaining degrees $(j_1, k_1)$ and $(j_2, k_2)$ for these vertex
pairs, and then replace the edges with two new ones $(v_1, v_2)$ and $(w_1, w_2)$ with probability
$\min\bigl(1, (e_{j_1 j_2} e_{k_1 k_2})/(e_{j_1 k_1} e_{j_2 k_2})\bigr)$. This dynamics conserves
the degree sequence, is ergodic on the set of graphs having that degree sequence, and, with the
choice of acceptance probability above, satisfies detailed balance for state probabilities
$\prod_i e_{j_i k_i}$, and hence has the required edge distribution $e_{jk}$ as its fixed point.
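A single step of this rewiring dynamics can be sketched as below. The data layout is an assumption of the example: `edges` is a list of vertex pairs, `degree[v]` is the total degree of vertex `v`, and `e[j][k]` is the target distribution indexed by remaining degree. Self-loops and repeated edges are allowed here, which the text does not specify.

```python
import random


def rewire_step(edges, degree, e, rng=random):
    """Attempt one Metropolis move; return True if the rewiring was accepted."""
    a, b = rng.sample(range(len(edges)), 2)  # two distinct edges
    v1, w1 = edges[a]
    v2, w2 = edges[b]

    # Remaining degree is total degree minus the edge we arrived along.
    j1, k1 = degree[v1] - 1, degree[w1] - 1
    j2, k2 = degree[v2] - 1, degree[w2] - 1

    old = e[j1][k1] * e[j2][k2]
    new = e[j1][j2] * e[k1][k2]
    # Accept with probability min(1, new / old); a zero-weight current
    # configuration is always left.
    if old == 0 or rng.random() < min(1.0, new / old):
        edges[a] = (v1, v2)
        edges[b] = (w1, w2)
        return True
    return False


# Usage on a toy graph with a flat target distribution (every move accepted).
rng = random.Random(0)
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
degree = [3, 2, 3, 2]
e = [[1.0 / 9] * 3 for _ in range(3)]
accepted = sum(rewire_step(edges, degree, e, rng) for _ in range(500))
```

Each move only swaps edge ends between the four vertices involved, so every vertex keeps its degree and the `degree` list never needs updating.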
As an example, consider the symmetric binomial form

$$ e_{jk} = \mathcal{N} e^{-(j+k)/\kappa}
   \left[ \binom{j+k}{j} p^j q^k + \binom{j+k}{k} p^k q^j \right], \tag{10} $$

where $p + q = 1$, $\kappa > 0$, and $\mathcal{N} = \tfrac12 (1 - e^{-1/\kappa})$ is a normalizing
constant. (The binomial probabilities $p$ and $q$ should not be confused with the quantities $p_k$
and $q_k$ introduced earlier.) This distribution is chosen for analytic tractability, although its
behavior is also quite natural: the distribution of the sum $j + k$ of the degrees at the ends of
an edge falls off as a simple exponential, while that sum is distributed between the two ends
binomially, the parameter $p$ controlling the assortative mixing. From Eq. (3), the value of $r$ is

$$ r = \frac{8pq - 1}{2e^{1/\kappa} - 1 + 2(p - q)^2}, \tag{11} $$

which can take both positive and negative values, passing through zero when
$p = p_0 = \tfrac12 - \tfrac14\sqrt{2} = 0.1464\ldots$
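The closed form in Eq. (11) can be compared against a direct evaluation of Eq. (3) on the distribution of Eq. (10). The sketch below is illustrative only: the function names and the truncation size `K` are assumptions of this example, and `math.comb` requires Python 3.8 or later.

```python
import math


def binomial_edge_distribution(p, kappa, K=80):
    """Tabulate e_jk of Eq. (10) for j, k < K."""
    q = 1.0 - p
    norm = 0.5 * (1.0 - math.exp(-1.0 / kappa))
    e = [[0.0] * K for _ in range(K)]
    for j in range(K):
        for k in range(K):
            c = math.comb(j + k, j)  # equal to comb(j + k, k)
            weight = p ** j * q ** k + p ** k * q ** j
            e[j][k] = norm * math.exp(-(j + k) / kappa) * c * weight
    return e


def r_from_edge_distribution(e):
    """Evaluate the normalized correlation of Eq. (3) on a tabulated e_jk."""
    K = len(e)
    q = [sum(e[j][k] for j in range(K)) for k in range(K)]
    mu = sum(k * q[k] for k in range(K))
    variance = sum(k * k * q[k] for k in range(K)) - mu * mu
    covariance = sum(j * k * e[j][k] for j in range(K) for k in range(K)) - mu * mu
    return covariance / variance


def r_closed_form(p, kappa):
    """Eq. (11)."""
    q = 1.0 - p
    return (8 * p * q - 1) / (2 * math.exp(1.0 / kappa) - 1 + 2 * (p - q) ** 2)


kappa = 2.0
for p in (0.05, 0.5):
    e = binomial_edge_distribution(p, kappa)
    print(p, r_from_edge_distribution(e), r_closed_form(p, kappa))
```

For the values used here the tail beyond `K` carries a negligible weight, so the two columns should agree closely; a larger $\kappa$ would need a larger `K`.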

In Fig. 1 we show the size of the giant component for graphs of this type as a function of the
degree scale parameter $\kappa$, from both our numerical simulations and the exact solution above.
As the figure shows, the two are in good agreement. The three curves in the figure are for
$p = 0.05$, where the graph is disassortative, $p = p_0$, where it is neutral (neither assortative
nor disassortative), and $p = 0.5$, where it is assortative.

As $\kappa$ becomes large we see the expected phase transition at which a giant component forms.
There are two important points to notice about the figure. First, the position of the phase
transition moves lower as the graph becomes more assortative. That is, the graph percolates more
easily, creating a giant component, if the high-degree vertices preferentially associate with other
high-degree ones. Second, notice that, by contrast, the size of the giant component for large
$\kappa$ is smaller in the assortatively mixed network.

These findings are intuitively reasonable. If the network mixes assortatively, then the high-degree
vertices will tend to stick together in a subnetwork or core group of higher mean degree than the
network as a whole. It is reasonable to suppose that percolation would occur earlier within such a
subnetwork. Conversely, since percolation will be restricted to this subnetwork, it is not
surprising that the giant component has a smaller size in this case than when the network is
disassortative. These results could have implications, for example, for the spread of disease on
social networks (social networks being assortatively mixed in many cases, as Table I shows). The
core group of an assortatively mixed network could form a "reservoir" for disease, sustaining an
epidemic even in cases in which the network is not sufficiently dense on average for the disease to
persist. On the other hand, one would expect the disease to be restricted to a smaller segment of
the population in such cases than for diseases spreading on neutral or disassortative networks.

FIG. 1: Size of the giant component as a fraction of graph size for graphs with the edge
distribution given in Eq. (10). The points are simulation results for graphs of $N = 100\,000$
vertices while the solid lines are the numerical solution of Eq. (9). Each point is an average over
ten graphs; the resulting statistical errors are smaller than the symbols. The values of $p$ are
0.5 (circles), $p_0 = 0.146\ldots$ (squares), and 0.05 (triangles).

Assortative mixing also has implications for questions of network resilience, the subject of much
discussion in the recent literature [28]. It has been found that the connectivity of many networks
(i.e., the existence of paths between pairs of vertices) can be destroyed by the removal of just a
few of the highest degree vertices, a result that may have applications in, for example,
vaccination strategies. In assortatively mixed networks, however, we find numerically that removing
high-degree vertices is a relatively inefficient strategy for destroying network connectivity,
presumably because these vertices tend to be clustered together in the core group, so that removing
them is somewhat redundant. In a disassortative network with a similarly sized giant component,
attacks on the highest degree vertices are much more effective, these vertices being broadly
distributed over the network and presumably therefore forming links on many paths between other
vertices. For networks of the type described by Eq. (10), we find that the number of high-degree
vertices that need to be removed to destroy similarly sized giant components is greater by a factor
of about five to ten in an assortative network ($p = 0.5$) than in a disassortative one
($p = 0.05$) for the typical parameter values studied here.

These considerations paint rather a grim picture: the 
networks that we might want to break up, such as the 
social networks that spread disease, appear to be assor- 
tative, and therefore are resilient, at least against simple 
targeted attacks such as attacks on the highest degree 
vertices. And yet at the same time the networks that we 
would wish to protect, including technological networks 
such as the Internet, appear to be disassortative, and are 
hence particularly vulnerable. 

To conclude, in this paper we have studied assortative mixing by degree in networks, that is, the
tendency for high-degree vertices to associate preferentially with other high-degree vertices. We
have defined a scalar measure of assortative mixing and used it to show that many social networks
have significant assortative mixing, while technological and biological networks seem to be
disassortative. We have also proposed a model of an assortatively mixed network, which we have
solved exactly using generating function techniques, and also simulated using a Monte Carlo graph
sampling method. Within this model we find that assortative networks percolate more easily and that
they are also more robust to removal of their highest degree vertices, while disassortative
networks percolate less easily and are more vulnerable. This suggests that social networks may be
robust to intervention and attack while technological networks are not.